Papers Read
Machine Learning Systems
- Puzzle: Efficiently Aligning Large Language Models through Light-Weight Context Switch
- Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
- Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training
- Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping
- Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation
- ASPEN: Breaking Operator Barriers for Efficient Parallel Execution of Deep Neural Networks
- Breaking the Computation and Communication Abstraction Barrier in Distributed Machine Learning Workloads
- Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
- Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning